---
title: "AzureOpenAIChatGenerator"
id: azureopenaichatgenerator
slug: "/azureopenaichatgenerator"
description: "This component enables chat completion using OpenAI’s large language models (LLMs) through Azure services."
---

# AzureOpenAIChatGenerator

This component enables chat completion using OpenAI’s large language models (LLMs) through Azure services.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | After a [ChatPromptBuilder](../builders/chatpromptbuilder.mdx)  |
| **Mandatory init variables** | `api_key`: The Azure OpenAI API key. Can be set with `AZURE_OPENAI_API_KEY` env var.  <br /> <br />`azure_ad_token`: Microsoft Entra ID token. Can be set with `AZURE_OPENAI_AD_TOKEN` env var. |
| **Mandatory run variables** | `messages`: A list of [`ChatMessage`](../../concepts/data-classes/chatmessage.mdx)  objects representing the chat |
| **Output variables** | `replies`: A list of alternative replies of the LLM to the input chat |
| **API reference** | [Generators](/reference/generators-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/chat/azure.py |

</div>

## Overview

`AzureOpenAIChatGenerator` supports OpenAI models deployed through Azure services. To see the list of supported models, head over to Azure [documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?source=recommendations). The default model used with the component is `gpt-4o-mini`.

To work with Azure components, you will need an Azure OpenAI API key, as well as an Azure OpenAI Endpoint. You can learn more about them in Azure [documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference).

The component uses `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_AD_TOKEN` environment variables by default. Otherwise, you can pass `api_key` and `azure_ad_token` at initialization:

```python
client = AzureOpenAIChatGenerator(azure_endpoint="<Your Azure endpoint e.g. `https://your-company.azure.openai.com/>",
                        api_key=Secret.from_token("<your-api-key>"),
                        azure_deployment="<a model name>")
```

:::note
We recommend using environment variables instead of initialization parameters.

:::

Then, the component needs a list of `ChatMessage` objects to operate. `ChatMessage` is a data class that contains a message, a role (who generated the message, such as `user`, `assistant`, `system`, `function`), and optional metadata. See the [usage](https://www.notion.so/AzureOpenAIChatGenerator-c20636ac8b914ab798439a5f7a273ff0?pvs=21) section for an example.

You can pass any chat completion parameters that are valid for the `openai.ChatCompletion.create` method directly to `AzureOpenAIChatGenerator` using the `generation_kwargs` parameter, both at initialization and to `run()` method. For more details on the supported parameters, refer to the [Azure documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference).

You can also specify a model for this component through the `azure_deployment` init parameter.

### Structured Output

`AzureOpenAIChatGenerator` supports structured output generation, allowing you to receive responses in a predictable format. You can use Pydantic models or JSON schemas to define the structure of the output through the `response_format` parameter in `generation_kwargs`.

This is useful when you need to extract structured data from text or generate responses that match a specific format.

```python
from pydantic import BaseModel
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage

class NobelPrizeInfo(BaseModel):
    recipient_name: str
    award_year: int
    category: str
    achievement_description: str
    nationality: str

client = AzureOpenAIChatGenerator(
    azure_endpoint="<Your Azure endpoint>",
    azure_deployment="gpt-4o",
    generation_kwargs={"response_format": NobelPrizeInfo}
)

response = client.run(messages=[
    ChatMessage.from_user(
        "In 2021, American scientist David Julius received the Nobel Prize in"
        " Physiology or Medicine for his groundbreaking discoveries on how the human body"
        " senses temperature and touch."
    )
])
print(response["replies"][0].text)

```

:::note Model Compatibility and Limitations

- Pydantic models and JSON schemas are supported for latest models starting from GPT-4o.
- Older models only support basic JSON mode through `{"type": "json_object"}`. For details, see [OpenAI JSON mode documentation](https://platform.openai.com/docs/guides/structured-outputs#json-mode).
- Streaming limitation: When using streaming with structured outputs, you must provide a JSON schema instead of a Pydantic model for `response_format`.
- For complete information, check the [Azure OpenAI Structured Outputs documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/structured-outputs).
:::

### Streaming

You can stream output as it’s generated. Pass a callback to `streaming_callback`. Use the built-in `print_streaming_chunk` to print text tokens and tool events (tool calls and tool results).

```python
from haystack.components.generators.utils import print_streaming_chunk

## Configure any `Generator` or `ChatGenerator` with a streaming callback
component = SomeGeneratorOrChatGenerator(streaming_callback=print_streaming_chunk)

## If this is a `ChatGenerator`, pass a list of messages:
## from haystack.dataclasses import ChatMessage
## component.run([ChatMessage.from_user("Your question here")])

## If this is a (non-chat) `Generator`, pass a prompt:
## component.run({"prompt": "Your prompt here"})
```

:::note
Streaming works only with a single response. If a provider supports multiple candidates, set `n=1`.

:::

See our [Streaming Support](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) docs to learn more how `StreamingChunk` works and how to write a custom callback.

Give preference to `print_streaming_chunk` by default. Write a custom callback only if you need a specific transport (for example, SSE/WebSocket) or custom UI formatting.

## Usage

### On its own

Basic usage:

```python
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import AzureOpenAIChatGenerator
client = AzureOpenAIChatGenerator()
response = client.run(
	  [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
)
print(response)
```

With streaming:

```python
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import AzureOpenAIChatGenerator
client = AzureOpenAIChatGenerator(streaming_callback=lambda chunk: print(chunk.content, end="", flush=True))
response = client.run(
	  [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
)
print(response)
```

### In a pipeline

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

## no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = AzureOpenAIChatGenerator()

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")
location = "Berlin"
messages = [ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
            ChatMessage.from_user("Tell me about {{location}}")]
pipe.run(data={"prompt_builder": {"template_variables":{"location": location}, "template": messages}})
```
